NSF PAR Search | NSF Public Access Repository


Non-stochastic Budgeted Online Pricing with Semi-Bandit Feedback

https://doi.org/10.1609/aaai.v39i18.34089

Liu, Xiang; Chan, Hau; Li, Minming; Wu, Weiwei; Tran-Thanh, Long (April 2025, Proceedings of the AAAI Conference on Artificial Intelligence)

We consider a general non-stochastic online pricing bandit setting in a procurement scenario where a buyer with a budget wants to procure items from a fixed set of sellers to maximize the buyer's reward by dynamically offering purchasing prices to the sellers, where the sellers' costs and values at each time period can change arbitrarily and the sellers determine whether to accept the offered prices to sell the items. This setting models online pricing scenarios of procuring resources or services in multi-agent systems. We first consider the offline setting when sellers' costs and values are known in advance and investigate the best fixed-price policy in hindsight. We show that it has a tight approximation guarantee with respect to the offline optimal solutions. In the general online setting, we propose an online pricing policy, Granularity-based Pricing (GAP), which exploits underlying side-information from the feedback graph when the budget is given as the input. We show that GAP achieves an upper bound of $O\left(n \frac{v_{\max}}{c_{\min}} \sqrt{B/c_{\min}} \ln B\right)$ on the $\alpha$-regret, where $n$, $v_{\max}$, $c_{\min}$, and $B$ are the number, the maximum value, the minimum cost of sellers, and the budget, respectively. We then extend it to the unknown budget case by developing a variant of GAP, namely Doubling-GAP, and show its $\alpha$-regret is at most $O\left(n \frac{v_{\max}}{c_{\min}} \sqrt{B/c_{\min}} \ln^2 B\right)$. We also provide an $\alpha$-regret lower bound $\Omega\left(v_{\max} \sqrt{Bn/c_{\min}}\right)$ of any online policy that is tight up to sub-linear terms. We conduct simulation experiments to show that the proposed policy outperforms the baseline algorithms.
Free, publicly-accessible full text available April 11, 2026
Socialbots on Fire: Modeling Adversarial Behaviors of Socialbots via Multi-Agent Hierarchical Reinforcement Learning

https://doi.org/10.1145/3485447.3512215

Le, Thai; Tran-Thanh, Long; Lee, Dongwon (April 2022, Proceedings of the ACM Web Conference 2022)

Full Text Available
To Ask or Not to Ask: A User Annoyance Aware Preference Elicitation Framework for Social Robots

Gucsi, Balint; Tarapore, Danesh S; Yeoh, William; Amato, Christopher; Tran-Thanh, Long (January 2020, Proceedings of the International Conference on Intelligent Robots and Systems)

Full Text Available
Deceiving Cyber Adversaries: A Game Theoretic Approach

Schlenker, Aaron; Thakoor, Omkar; Xu, Haifeng; Tambe, Milind; Vayanos, Phebe; Fang, Fei; Tran-Thanh, Long; Vorobeychik, Yevgeniy (January 2018, International Conference on Autonomous Agents and Multiagent Systems)

An important way cyber adversaries find vulnerabilities in modern networks is through reconnaissance, in which they attempt to identify configuration specifics of network hosts. To increase uncertainty of adversarial reconnaissance, the network administrator (henceforth, defender) can introduce deception into responses to network scans, such as obscuring certain system characteristics. We introduce a novel game theoretic model of deceptive interactions of this kind between a defender and a cyber attacker, which we call the Cyber Deception Game. We consider both a powerful (rational) attacker, who is aware of the defender’s exact deception strategy, and a naive attacker who is not. We show that computing the optimal deception strategy is NP-hard for both types of attackers. For the case with a powerful attacker, we provide a mixed-integer linear program solution as well as a fast and effective greedy algorithm. Similarly, we provide complexity results and propose exact and heuristic approaches when the attacker is naive. Our extensive experimental analysis demonstrates the effectiveness of our approaches.
Full Text Available
